Introduction

ANU BDSI workshop
Introduction to R programming

Emi Tanaka

Biological Data Science Institute

3rd April 2024

Welcome 👋

Teaching team

Dr. Emi Tanaka

Helper TBD
  • Who are you?
    • What statistical software have you used before?
    • Introduce yourself to people around you

Workshop materials

All materials will be hosted at
https://anu-bdsi.github.io/workshop-intro-R/

Learning objectives

The main aim is for you to get started with using R for basic computations.

  • Perform basic arithmetic calculations in R
  • Understand missing values in R
  • Compute basic summary statistics (e.g. mean, median, quartiles, standard deviation) in R
  • Install contributed packages into R
  • Work with lists, matrices and vectors in R
  • Navigate the RStudio integrated development environment (IDE)
  • Read and write data in R (csv, tsv and xlsx)
  • Understand the different object types in R
  • Save objects in R (RData and RDS)
  • Write simple functions, conditional statements and for loops in R
  • Decipher error messages and do basic troubleshooting
  • Write minimal reproducible examples

What is R?

  • R is a programming language used predominantly for data analysis
  • RStudio Desktop is an integrated development environment (IDE) that helps you to use R

How to use R?

  • RStudio Desktop (or RStudio IDE) is the most common way to use R

  • You can type operations directly into the Console pane
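
As a small illustration of what typing into the Console can look like, the sketch below runs some basic arithmetic and a summary with a missing value. The vector name `heights` and its values are made up for this example and are not part of the workshop data.

```r
# Basic arithmetic: type each line into the Console and press Enter
1 + 2    # addition
7 / 2    # division
2^3      # exponentiation

# Store some values in a vector (hypothetical example data)
heights <- c(150, 162, NA, 171)

# NA marks a missing value; mean() returns NA unless told to skip it
mean(heights)                 # NA
mean(heights, na.rm = TRUE)   # 161
```

The `na.rm = TRUE` argument asks `mean()` to drop missing values before computing the average.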

Live demo

Customise Global Options

  • Go to RStudio > Tools > Global Options…
  • Under the General tab, make sure “Restore .RData into workspace at startup” is unticked.
  • This avoids unexpectedly loading (old) data into your workspace, which can make your code work only in your own workspace and not for others (poor practice for reproducibility).

R Packages

  • R packages are community-developed extensions to R (much like apps on your mobile)
  • The Comprehensive R Archive Network (CRAN) is a volunteer-maintained repository that hosts submitted R packages that have been approved (much like an app store)
    • There are close to 20,000 packages available on CRAN
    • The quality of R packages varies
  • There are other repositories that host R packages, e.g. Bioconductor for bioinformatics (we won’t cover this)
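
A minimal sketch of how installing and loading a CRAN package usually looks is given below. The package `readxl` is used only as an illustrative choice here. The install line is commented out because it needs an internet connection and only has to be run once per machine.

```r
# Install a package from CRAN (run once; requires internet access)
# install.packages("readxl")

# Load an installed package at the start of each R session
# library(readxl)

# Check which packages are already installed on this machine
installed <- rownames(installed.packages())
"stats" %in% installed   # TRUE: stats ships with every R installation
```

Installing is like downloading an app, while `library()` is like opening it: the first is done once, the second in every session where you need the package.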

Photo by Sara Kurfeß on Unsplash

Why learn R?

  • R is one of the top programming languages for statistics or data science
    • Python is a good alternative language for data science
    • It is better to master one language than none
  • R was initially developed by statisticians for statisticians
    • State-of-the-art statistical methods are generally more readily available in R
  • R has an active and friendly community
  • R is free and open source software (FOSS)
    • free = money is not a barrier to use it
    • open source software = transparency

How to get better at R?

  • PRACTICE
  • Practice with a purpose (e.g. using R on your own data)
  • Try teaching and helping others with their R problems
  • Have a willingness to continuously learn and adapt
    • R is an ever-evolving language (check the release news every so often)
    • new features and packages are added very frequently
    • whether you are a beginner or not, there are always things you do not know about R
  • Do you have any strategies or tips? Please share!